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Abstract 

Interest in the study of in-host microbial communities has increased in recent years due to our im¬ 
proved understanding of the communities’ significant role in host health. As a result, the ability to model 
these communities using differential equations, for example, and analyze the results has become increas¬ 
ingly relevant. The size of the models and limitations in data collection among many other considerations 
require that we develop new parameter estimation methods to address the challenges that arise when 
using traditional parameter estimation methods for models of these in-host microbial communities. In 
this work, we present the challenges that appear when applying traditional parameter estimation tech¬ 
niques to differential equation models of microbial communities, and we provide an original, alternative 
method to those techniques. We show the derivation of our method and how our method avoids the lim¬ 
itations of traditional techniques while including additional benefits. We also provide simulation studies 
to demonstrate our method’s viability, the application of our method to a model of intestinal microbial 
communities to demonstrate the insights that can be gained from our method, and sample code to give 
readers the opportunity to apply our method to their own research. 

Keywords. Parameter estimation, differential equations, Lotka-Volterra models, intestinal microbiota. 


1 Introduction 

The composition of in-host microbial communities (microbiota) plays a significant role in host health, and a 
better understanding of the microbiota’s role in a host’s transition from health to disease or vice versa could 
lead to novel medical treatments. One of the first steps toward this understanding is exploring the interaction 
dynamics of the microbes that compose the microbiota, which often are modeled using systems of differen¬ 
tial equations. The size and complexity of microbiota dynamics, not to mention the difficulties involved in 
collecting sufficient data, makes this type of modeling exceedingly challenging. The inefficiencies and lack of 
robustness displayed by traditional parameter estimation techniques, such as single-shooting methods, only 
add to the challenge. Building on previously developed alternatives to traditional methods, we establish a 
novel parameter estimation method by approximating the model’s state variables with spline functions and 
relaxing any known hard constraints on the model to achieve a new problem statement. This approach de¬ 
fines a “nearby” parameter estimation problem that can be solved using robust numerical methods. We first 
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verify our method on simulation studies using data generated by given generalized Lotka-Volterra equations. 
We then employ our method on data from an intestinal microbiota experiment, and we compare our results 
to a published parameterized model that uses the same data. In the simulation studies, we recover both 
the parameters and data, and in the comparison to the published intestinal microbiota model, our method 
exhibits superior data recovery. Our intestinal microbiota model also captures experimentally confirmed 
interactions. Based on these results, we conclude our robust method successfully parameterizes microbiota 
dynamics when modeled by a system of differential equations. This parameterization can lead to both qual¬ 
itative and quantitative insights into the microbiota and direct future experiments to further improve the 
understanding of its dynamics. 


This paper is structured as follows. In Sections o we provide biological background of the dynamics of 
microbial communities. A mathematical model of the interactions of microbial communities is introduced 
in Section |1.2| The standard setup for parameter estimation for ordinary differential equations is given 
in Section |2.1| followed by Section |2.2| on our novel parameter estimation method continuous shooting. In 
Sectionj^we provide first a parameter estimation study for simulation data (Section 3.1) and secondly apply 
our continuous shooting method to a given data set on intestinal microbiota (Section [3.2[ ). Section|^discusses 
our new methods and the results and implications from the microbiota data. 

Note, to increase readability and quickly deliver our main points, we moved various details on our method, 
numerical derivations, and results to the appendix. Appendix |A.1| and |A.2| provide splines definitions and 
its required derivations. In Appendix we give detailed information on the computations, simulation 
study, and the intestinal microbiota. A Matlab implementation of the main procedures is provided in 
Appendix 0 including continuousShooting.m, cubicSplines .m, and lotkaVolterra.m. Additionally a 
zip file is provided or can be downloaded at www.math.vt.edu/people/mcchung/ with all relevant Matlab 
codes to run an example problem (execute exampleScript .m). 


1.1 Biology of Microbial Communities 

Bacteria are ubiquitous in our world, and play a key role in maintaining the health of our environment as 
well as the health of virtually all living organisms. The human associated microbial communities (human 
microbiota) have been shown to be at least associated with, if not causative of, several human diseases such 
as periodontitis [3S], type 2 diabetes [33], atopic dermatitis [33], ulcerative colitis [IS], Crohn’s disease m, 
and vaginosis [35]. Furthermore, time series data have shown that the host-associated microbiota undergo 
dynamic changes over time within a same individual, e.g., within the gut of a developing infant [291122] . 
the gut, mouth, and skin of healthy adults [5], and within the vagina of reproductive age women m- 
The mechanisms that underlie these changes are currently not well understood, whether they represent 
fluctuations in the normal flora, or the transition from health to disease, and conversely from disease to 
health after treatment. 

Understanding the role human-associated microbial communities play in health and disease requires the 
elucidation of the complex networks of interactions between the microbes and between microbes and the 
host, a challenging task due to our inability to directly observe bacterial interactions. Instead, researchers 
have reconstructed microbial networks based on indirect approaches, such as knowledge about the metabolic 
functions encoded in the genomes of the interacting partners [24] . coexistence patterns across multiple 
samples [9], covariance of abundance across samples [13], or changes in abundance across time [SH] 139] . 
Multiple mathematical formalisms have been used to reason about the resulting networks with examples 
including metabolic modeling through flux balance analysis mi, machine learning algorithms based on 
environmental parameters m, and differential equation based models of interactions [39] . 

Here we focus on the latter, a flexible formalism that can model complex interaction patterns, including 
abundance-dependent interaction parameters [43] . While such modeling approaches have been developed 
since the 1980s in the context of wastewater treatment systems [19] . their use in studying human-associated 
microbial communities has been limited, in no small part due to the specific characteristics of human mi- 
crobiome data. First, the rate at which samples can be collected is severely limited by clinical and logistical 
factors, e.g., stool samples can be collected roughly on a daily basis, while subgingival plaque may only be 
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feasibly collected at an interval of several months. Second, microbiome data are sparse, i.e., most organisms 
are undetected in most samples [3Dj due to the detection limits of sequencing-based assays as well as the high 
variability of the microbiota across the human population. Third, it is difficult if not impossible to directly 
measure environmental parameters, such as nutrient concentrations, that may impact the microbiota. 

These features of the data derived from the human-associated microbiota lead to an ill-posed parameter 
estimation problem since multiple parameter sets may be consistent with the data. Numerical instabilities 
that result from specific parameter sets can also cause traditionally used estimation procedures to fail. Here 
we explore solutions to the parameter estimation problems in the context of the Lotka-Volterra formalism, 
which is described in more detail below. 

1.2 Mathematical Modeling of Microbial Communities 

We focus on a special type of differential equation based models of interactions, the Lotka-Volterra model, 
which is named after Alfred J. Lotka (1880-1949), an American mathematician, physical chemist, and statis¬ 
tician, and Vito Volterra (1860-1940), an Italian mathematician and physicist [3]. This model was originally 
developed in the context of predator-prey interactions; however, it can be generalized to more complex in¬ 
teractions. Let y be the time dependent state variable for the dynamics of n species with time variable t. 
Then the Lotka-Volterra system can be written as 

y'= f(y) = diag(y) (b-P Ay). (1) 

Here, diag(y) is the diagonal matrix with the state variables y = [j/i,..., as its entries, and b = 
[bi,... ,bn]^ G M" is the intrinsic growth rate, which incorporates the natural birth and death rate of each 
species in a given environment. Negative bi refers to a negative intrinsic growth rate and species i’s survival 
depends on the interaction with other species. The matrix A G represents the dynamics of the 

relationships between the species and is often referred to as the interaction matrix. An element = [A]i j 
of A describes the influence of species j on the growth of species i. For i ^ j and < 0, we classically 
consider species i to be a prey of predator j and vice versa for a^- > 0. If for i ^ j both aij < 0 and aji < 0, 
species i and j are competing for existence. On the other hand, ii i ^ j both aij > 0 and aji > 0, symbiotic 
behavior between species i and j is observed. If aij = 0, i j, no direct interaction between species i and j 
exists. 

This formalism allows the simulation of ecological systems and the study of the long-term behavior 
of these systems. For example, the equilibrium solution yoo = 0 describes the extinction of all species. 
The equilibrium yoo = 0 is unstable if and only if at least one intrinsic growth rate bi is positive. All 
other biologically feasible, i.e., nonnegative, equilibrium solutions yoo of Q are solutions of the equation 
0 = diag(y) (b -I- Ay). The real parts of the eigenvalues of fy(yoo) determine the stability of the additional 
equilibria. Here, fy(yoo) is the Jacobian of f with respect to y evaluated at yoo with fy(y) = diag(b) -|- 
diag(y) A -f diag(Ay). Detailed analysis on population dynamics, persistence, and stability can be found 
in [37], and the specifics for the dynamics of Lotka-Volterra systems can be found in |32] • 

Assumptions and Prior Knowledge 

The Lotka-Volterra system makes simplifying assumptions about the underlying biological system, in partic¬ 
ular it assumes that the interaction between microbes is constant in time and independent of the abundance 
of the interacting partners. As a result, certain types of microbial interactions, e.g., quorum sensing, cannot 
be effectively modeled. For the sake of computational tractability, we restrict ourselves to this traditional 
definition of the Lotka-Volterra model, but extensions that allow more complex interaction modalities |43| 
can also be addressed by the computational parameter estimation framework described below. 

The number of parameters of the Lotka-Volterra model is proportional to the square of the number of 
interacting partners, complicating the parameter estimation problem for complex datasets, especially when 
the number of samples is limited. Prior knowledge about the system is often available and can be used to 
mitigate the complexity of parameter estimation. In the context of host-associated microbiota, this prior 
knowledge may include: 
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Known Parameters. Intrinsic growth rates or specific interactions might be known or partially known 
prior to parameter estimation. 

Grouping. A reduction in the size of the system in Q can be achieved if species with similar behavior can 
be pooled together into a meta-species. 

Biomass. Often a reasonable assumption is the total biomass in a dynamical system remains constant or 
is tightly regulated at all time. 

Symmetry. Knowing the influence of species j on species i may simultaneously give information on both 
interaction parameters Uij and Uji. 

Finite Carrying Capacities. It can be assumed that all species display logistic growth and have a finite 
carrying capacity in the absence of all other species. 

Sparsity. For some biological systems, an interaction between species or the lack thereof might be known. 
Even in cases when an exact sparsity pattern is not known, knowledge of interaction sparsity can be 
informative. 

In this work we develop a novel parameter estimation method to recover intrinsic information of the 
dynamical system, i.e., the parameters, or predictors as they are called in statistics, given temporal density 
observations of the interacting species. In addition, this method is flexible enough to make use of any prior 
knowledge of the biological system such as the possible assumptions listed above. In particular, we propose 
the explicit incorporation of a sparsity assumption in the model. 


2 Methods 

2.1 Problem Statement 

In order to validate and impel model predictions, the mathematical model needs to be compared to ex¬ 
perimentally observed data d S K"*. The estimation of parameters for dynamical systems is a key step 
in the analysis of biological systems. The point estimates give quantitative information about the system, 
and in the case of a Lotka-Volterra system specifically, the intrinsic growth rates and interaction dynamics 
between species. Let us assume intrinsic growth rates b and interaction dynamics A are unknown and are 
collected in the parameter vector p = [b;vec(A)], where vec(A) is the concatenation of the columns of A, 
i.e., vec(A) = [an,..., a„i, ai 2 ,..., 

The general parameter estimation problem for any explicit first order ordinary differential equation (not 
just restricted to 0 ) is stated as the following constrained weighted least squares problem 

(P,y) = argmin||W(ni(y(t;p)) - d )||2 

(p^y) (2) 

subject to y' = f(t,y,p) and c(p,y) = 0 , 

with y : [a, 5] x M" — >■ K". First, note that additional constraints such as prior knowledge discussed above 
are gathered by the general statement c(p,y) = 0 . For every feasible parameter set p and initial state 
yo = y(a) we assume that the conditions of the Theorem of Picard-Lindelof are fulfilled. This means that 
a unique state variable y exists for any feasible choice p and yg. For our focus, the Lotka-Volterra system 
fulfills this condition for any finite p and yg. Further, m : K" —)■ K™ is a projection from the state space 
onto the measurement space of given data d = [di,..., dmY ^ For instance, observations d might not 
include all states at all time points, d might be in a frequency domain, or d may only be a combination of 
observed states. The precision matrix W^W is also referred to as the inverse covariance matrix, and the 
optimization problem ([^ can be seen as a weighted least squares problem with weight matrix W. Here, 
we assume that we have independent samples d and W = diag(w) = diag([ri;i,... ,WmY), with Wj > 0 for 
j = 1,... ,m. If iCj- is large, observation dj plays an important role for the parameter estimation procedure. 
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Otherwise, small Wj indicate a lesser role for observation dj. The underlying statistical assumption for this 
parameter estimation problem is that the residuals are normal distributed and in case of a diagonal W are 
uncorrelated [7j. 

Computationally, solving ([^ may be challenging. A limited number of observations, high levels of noise 
in the data, large dynamical systems, non-linearity of the system, and a large number of unknown parameters 
are all such challenges. These challenges appear for Lotka-Volterra models of biological systems and make 
parameter estimation extremely difficult [5]. Next, we note the methods traditionally used to solve the 
parameter estimation problem, discuss the limitations of these methods, and present a novel alternative. 

Traditional Methods 

Typically, single shooting methods are utilized to solve ([^ for biological systems gniii]. For single shooting 
methods, first, initial guesses for and yjj are used to numerically solve the initial value problem (forward 
problem) using single- or multi-step methods such as Runge-Kutta and Adams-Bashforth methods [ITl [18] . 
Next, the misfit between data and model is computed, and depending on the optimization strategy (e.g., 
gradient based strategies such as Gauss-Newton methods or direct search approaches such as the Nelder- 
Mead Method), a new set (p^,yo) is chosen. This process continues until a p and yo are found to fulfill 
pre-defined optimality criteria. Since most efficient optimization methods are typically local optimization 
methods, globalization is achieved, for example, by Monte Carlo sampling of the search space, i.e., repeated 
local optimization with random initial guesses m- The global minimizer chosen from the set of local 
minimizers is the local minimizer (p,yo) with minimal function value. Various other strategies for global 
optimization can be applied, such as simulated annealing m, evolutionary algorithms |38| , or particle swarm 
optimization methods |20] . 

The main drawback of these methods, though, is the numerical forward solver often fails to integrate 
with the required precision meaning the calculations of the forward solution and the misht between the 
data and model are not possible. It has been established that single shooting methods are not robust to 
initial guesses p° and yj), which in this case refers to the methods’ ability to successfully find minimizers as 
dehned in ([^ . Various alternatives have been developed to compensate for the lack of robustness [26115]. In 
particular, multiple shooting methods divide the time interval in several smaller subintervals, introducing 
initial conditions for each subinterval and solves the individual initial value problems on each subinterval. 
Constrained optimization methods will ensure continuity of the optimized state variable |S] when using 
multiple shooting methods, so these methods introduce more robustness when compared to single shooting 
methods by the reformulating the problem as a constrained optimization. 

2.2 Continuous Shooting 

In this manuscript we follow another approach similarly discussed in |32llini|34] by using an interpretation 
of a collocation-type method [T]. Instead of solving the original optimization problem we approximate 
the elements of the state variable for the system by a function s € S, where S is an any function space dense 
in C^([a, 6]), such as the cubic spline space we use here. Furthermore, we relax ([^ and use the standard 
approach of discretize-then-optimize [23 SI] • 

We first reformulate ([^ as 

min ||W(m(t;y) -d )||2 

(P,y) (3) 

subject to ||y'- f(t,y,p)||£p = 0 and c(p,y)=0, 

where || • is any appropriate integral norm with p = 2 for our later examples. The problems defined in ([^ 
and @ are equivalent since y is required to be continuous. Choosing a distance metric V and relaxing the 
optimization problem leads to 

min ||W(m(t;y) - d )||2 -f A |ly' - i{t, +aV{c{p,y)). 

(p.y) 
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Here, A > 0 and a > 0 can be seen as either Lagrange multipliers or regularization parameters [551 SI]- 
The parameter A has the effect that if A is small, the main contributor to the minimization process is the 
data misfit. On the other hand, if A is large, the weight of contribution shifts to the model equations. A 
similar interpretation holds for a. Note, further, that the conversion of the constraints c(p,y) = 0 to “soft” 
constraints is also optional. The constraints can remain as hard constraints and interior point methods or 
augmented Lagrangian methods can be utilized [55] . 

Let 6]) be the set of cubic splines with knots a = tq <■■■< Tk = b with a chosen set of boundary 

conditions, e.g., not-a-knot conditions, then every s G iS.^([a, 6]) is uniquely determined by a set of parameters 
q = [go, • • ■, QkV ■ A vectorized spline s defined by s = [si, ..., with sj G b]) for j = 1,..., n is 

then uniquely determined by a parameter vector q = [qi,..., qri]^. Approximating the state variable y by 
s, discretizing the integral of the /1^-norm, and normalizing A finally leads to the optimization problem 

min J(p,q) = ||W(m(t; s(t; q)) - d )||2 + — |!s'(T;q) - f(T, s(T; q), p )||2 
(p,q) ni 

+ aX>(c(p,s(t;q)), (4) 

where we define 



V(Ti;q)- 


■fm. 

s(ri;q),p)' 

s'(T;q) = 


, f(T,s(T;q),p) = 




y(T>;q). 


m, 

s{Ti; q),p)_ 


and a = Ti <■■■< Ti = b is a discretization of the interval [a, b]. We refer to this method as continuous 
shooting. 


Numerics 


To numerically solve Q with respect to p and q, we can utilize Gauss-Newton type methods. Suppose V 
is the two-norm, so a standard optimization algorithm can be written as follows. Let the residuals of 0 be 
defined by 



ri 


W(m(t;s(t;q)) - d) 

r = 

I’2 

= 

yA(s'(T;q)-f(T,s(T;q),p)) 


A3. 


yac(p,s(t;q)). 


and the Jacobian of r be given by 



r9ri 

Op 

aril 

aq 


0 

WnisSq 

J = 

8r2 

dp 

dr2 

aq 

= 

-VAfp(T) 

yA(s;(T)-f,(T)sq(T)) 


9r3 

Lap 

ar3 

aqj 





with the appropriate abbreviations 


5s 

5q 

dq 

d_ 

dq 


m(t; s). 

fp(T) = Af(T,s,p). 

-s{t;q), 

f.(T) = Af(T.s,p), 

^s(T;q), 

d , 

Cp = ^c(p,q), 

^s'(T;q), 

c. = ^c(p,s). 


Then the gradient of the objective function J is given by g = VJ'(p,q) = 2J^r and the Gauss-Newton 
approximation on the Hessian is H = V^J'(p,q) « 2J^J. Finding the structures for m, s, f, c, and 
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their corresponding partial derivatives can be done analytically. The details for the construction of the 
cubic splines s and their partial derivatives can be found in Appendix |A.1| and A.2 and the details for the 
construction of f and its partial derivatives when using the Lotka-Volterra system as a model can be found 
in Appendix [Bj 

The final remaining piece that needs to be acknowledged in the problem defined by Q is the determination 
of the regularization parameter(s) A and, if present, a. The goal in choosing the regularization parameters A 
and OL is to choose values such that the model system and any additional constraints are sufficiently weighted 
to prevent data overfitting but not overweighted to cause data underfitting. To accomplish this, we use 
fc-fold cross-validation to select the regularization parameters m- The reason we use cross-validation here 
is the method allows us to use a subset of known data to train our model by solving Q and the remaining 
known data to test the model’s predictive abilities. This is particularly useful in biological applications such 
as the ones being discussed here because the available data are often limited and hard to collect. 
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Figure 1. Extended numerical solution of a Lotka-Volterra system. 

The plotted solution is from time t = 0 to t = 200 and demonstrates the chaotic behavior of the system. 


3 Results 

In this section we share our findings with regard to our method’s viability and its applicability using simula¬ 
tion studies and previously collected and published data for intestinal microbiota, respectively. In both sets 
of findings, we focused on the Lotka-Volterra system of differential equations as a model and an unknown 
sparsity pattern in the interaction matrix as an assumption on the model. 
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3.1 Simulation Studies 


On the interval t S [0,10], we defined a four dimensional Lotka-Volterra system using the parameters and 
initial conditions 
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In this case, the interaction matrix displayed 37.5 percent sparsity. Using the parameters and initial condi¬ 
tions, we numerically solved the initial value problem with the differential equation given by Q. In Figure 
a phase plot of the solutions for States 1 through 4 indicates the system displayed chaotic dynamics, and it 
is inherently difficult for parameter estimation methods to find parameters of chaotic systems. 

We then used the numerical solution to generate three sets of data with different levels of multiplicative 
noise (Study 1: 0 percent noise; Study 2: up to 10 percent noise; Study 3: up to 25 percent noise) and applied 
our parameter estimation method using each data set. The data for each study are shown in Figure]^ and 
the remaining details for the problem setup are included in Appendix |B.2[ The code provided in Appendix [C| 
also allows for setting up similar studies. 


State 1 



State 2 



State 3 State 4 




Figure 2. Numerical solution of a Lotka-Volterra system and data points for simulation 
studies. 

The black curves indicate the numerical solutions used to generate the data. The blue dots are the data 
with no multiplicative noise. The red dots are the data with up to 10 percent multiplicative noise, and the 
yellow dots are the data with up to 25 percent multiplicative noise. 


The state solutions of the system found using the optimal model parameters recovered the data very well 
















in all three studies. The relative errors, defined as = — 

’ ' m. 


m(y)-d 

d 


1 


where m is the number of elements 


in d and the division is element-wise, were approximately ~ 0.0331, ~ 0.0926, and ~ 0.1511 for the 

three studies, respectively. 

As Figure demonstrates, we also compared our optimal model parameters to their true values, and 
the absolute errors in the optimal model parameters suggested the true model parameters and the system 
dynamics were recovered very well in all three studies. 
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Figure 3. Absolute errors of optimal model parameters and initial conditions. 

The images show the absolute errors in the recovery of the model parameters and initial conditions for all 
three simulation studies. 

By plotting the relative error between the spline solutions and the state solutions in Figure all three 
studies additionally illustrated that the optimal spline functions also proved to be a good approximation of 
the state solutions found using the optimal model parameters. 

3.2 Interactions within the Intestinal Microbiota 

The generalized Lotka-Volterra formalism was recently used to explore the impact of the intestinal microbiota 
on the development of antibiotic-induced Clostridium difficile colitis [39]. This disease occurs in patients 
who have been treated with antibiotics and is characterized by a marked shift of the intestinal microbiota 
towards a state dominated by the pathogen Clostridium difficile. In many cases health can only be restored 
through a fecal transplant, a procedure which restores the diversity of the microbiota. The mechanisms 
through which disease occurs, and through which the normal gut microbiota can prevent the over-growth of 
C. difficile are still not well understood. 

In Stein et al. [31] the authors relied on a mouse model of C. difficile colitis to attempt to address these 
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Figure 4. Relative error between spline solutions and state solutions. 

Each row shows the relative error for a single state for all three simulation studies. The relative error 
between the spline solution and state solution is calculated as over the time interval [0,10] with s 
denoting the optimal spline approximation and y denoting the numerical solution of the optimal 
Lotka-Volterra system. 


questions. They tracked the microbiota of mice across time and used the resulting data to estimate the 
parameters of a Lotka-Volterra model. Based on the resulting model they were able to provide new testable 
hypotheses about the factors that promote the overgrowth of C. difficile following a course of clindamycin. 
Here we used the same data and model and added the assumption of interaction sparsity to test our parameter 
estimation procedure and compare to the originally published results. 

We focused on a subset of the Stein et al. data, specifically data originating from three mice who had 
not been subjected to any antibiotic interventions. The exact details, including how we setup our problem, 
can be found in Appendix |B.3| 

In our simulation, the spline solutions remained good approximations to the state solutions as demon¬ 
strated in Figure]^ The relative errors for the spline solutions for the Blautia and Coprobacillus OTUs were 
larger than the relative errors for the other five OTUs, but this was due to both the significantly smaller 
magnitude of the data for these OTUs and the magnitude of the weights for the data relative to the other 
OTUs. Among the three replicates for a single OTU, variations in the magnitude of the relative errors, 
e.g., in Blautia, Unclassified Mollicutes, and Coprobacillus, were explained by noticeable variations in the 
magnitude of the data across replicates. 
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Figure 5. Relative error between spline solutions and state solutions. 

Each row shows the relative error for a single OTU for all three replicate experiments. The relative error 
between the spline solution and state solution is calculated as over the time interval [1,21] with s 
denoting the optimal spline approximation and y denoting the numerical solution of the optimal 
Lotka-Volterra system. Time is measured in days. 
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Figure 6. Comparison of state solutions. 

Each row compares the state solutions for a single OTU for all three replicate experiments using our 
parameter estimation results and those found in Stein et al. [53] ■ The black dots indicate the provided 
experimental data. The blue curves mark the state solutions found using our optimal Lotka-Volterra 
system, and the red curves denote the state solutions found using the optimal Lotka-Volterra system 
published by Stein et al. Time is measured in days, and abundance is measured in 10^^ DNA copies per 
cubic centimeter. 
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As in the simulation studies, we also assess our method’s ability to recover the data. The relative error 
between the state solutions using the optimal model parameters and the data for all three mice, given by 
Gr — ^ with m being the number of elements in d and the division being element-wise, was 

Cr ~ 0.4594. The relative error for the model published in [35] was e^. « 3.6790, indicating that our method 
more accurately captured the dynamics of the data. This fact was further confirmed by a visual comparison 
of the state solutions to the data in Figure]^ 



Figure 7. Comparison of interaction matrices. 

This figure compares (A) the interaction matrix from our optimal Lotka-Volterra system to (B) a subset of 
the interaction matrix published by Stein et al. |39j . In the graphs, entry in the interaction matrix is 
given as a directed edge from node j to node i. The value of the entry a^- is given by the color of the edge. 
OTUs: 1-Blautia, 2-Barnesiella, 3- Unclassified Mollicutes, 4- Undefined Lachnospiraceae, 5- Unclassified 
Lachnospiraceae, 6-Coprobacillus, and 7-Other. 

The biological implication of the disagreement between our results and those originally published on the 
same data became apparent when examining graph representations of the Lotka-Volterra interaction matrices 
in Figure[^and considering a recent paper from the same group providing experimental evidence for the role of 
the gut microbiota in the prevention of C. difficile infection [^. Stein et al. originally concluded, on the basis 
of Lotka-Volterra modeling, that members of the Coprobacillus genus inhibit the growth of other members 
of the gut microbiome, which implied Coprobacillus is a stabilizing factor within the gut microbiome. In 
our own analysis of their data, we did not identify any strong interactions between the Coprobacillus OTU 
and other organisms. Instead we observed inhibitory interactions of members of the Lachnospiraceae family 
with other gut microbes, which suggested that members of the Undefined Lachnospiraceae and Unclassified 
Laehnospiraceae groups are the more likely players involved in preventing C. difficile colonization. Buffie 
et al. confirmed this experimentally |6]. In their paper they showed that the Lachnospiraceae species 
Clostridium scindens can provide resistance to C. difficile colonization in a mouse model of C. difficile 
enterocolytis. 

4 Discussion 

Method Analysis 

In the Methods section we derive our approach given by Q from the typical parameter estimation problem 
given by ([^. In our approach, we do not find a solution to the original problem statement, but we solve a 
“nearby” problem instead with the idea that is numerically more robust. It is feasible to criticize that the 
optimization problem being solved is just an approximation to the solution of (§ , but even in cases where the 
solution to Q is not sufficiently accurate, this method can be used to efficiently precompute approximations 
for p and yoj which can then be used as initial guesses for single or multiple shooting methods. 
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One step in constructing our “nearby” problem is replacing the state variable y in the model with an 
approximation s. In our case, we use cubic spline functions for s, and for s = [si,..., the approximation 
error for each Sj G 5^([a, &]), j = ^, ■ ■ ■ jn, is bounded using the theorem below. 

Theorem ([3T]). Let m be a positive integer. For everyy G C^dajb]) and for every integer j G {1,..., min(TO, 4)}, 
the least maximum error satisfies the condition 


4! 1 

min ll-y — sll.^ < 7- r^ — h^ 

sG53([a.b]) ^ °° (4-j)!2J 


.,U) 


where h = max {t^+i — : i = 0,..., fc — 1}. 

Since yj G C^([a, 5]) for j = the bound on the approximation error for each Sj G 6]), j = 

1,..., n, simplifies to 

Also, while the “nearby problem” is numerically more robust than the problem given in ([^, the dimension 
of the optimization problem increases. This can adversely affect the speed of the optimization step, but is 
also counteracted by improvements in computational efficiency elsewhere. One example is that optimization 
steps in Q never require the calculation of the solution of the ODE model. Eliminating the need for the 
solution of the initial value problem removes a computationally intensive step in each optimization iteration 
and replaces the step with the analytic evaluation of the spline vector s and its time derivative s'. 
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Figure 8. Cumulative local sensitivities of optimized model parameters and initial conditions. 

The images display the cumulative local sensitivities for every optimal model parameter and initial 
condition for all three simulation studies. 
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Simulation Studies 


As an unknown sparsity pattern was an assumption in our model for these simulation studies, our method’s 
ability to capture the true sparsity pattern is relevant. Due to numerical and computational limitations, it 
is unlikely for any model parameter in the optimal set to be identically 0, so we instead consider any model 
parameters with a magnitude below a certain threshold, 10“^ in this case, to essentially be 0. The sparsity 
structure is perfectly preserved in the first simulation study, 87.5 percent recovered in the second, and 62.5 
percent recovered in the third. This is what we would expect as the increase in data noise over the three 
simulation studies should have an increasingly significant effect on the accuracy of the model parameters 
that our method returns. 

A model’s local sensitivity to the optimal model parameters and initial conditions, which is the solution 
of the initial value problem 


y 



f(i,y,p) 


y(a) 


yo 

9y 

Sp 


= 

|,f(i,y,p) 

1 

|,y(a) 

= 

0 

9y 

8yo. 

} 




a|iy(a) 


vec(I„)_ 


over the interval [a, b], is also often of interest. To have some idea of the cumulative effect of a single parameter 
or initial condition on the entire n-state Lotka-Volterra system we calculate Sp- for i = 1,... ,n^ + n and 
Syg . for z = 1,..., n with 


Sr, 



dt and Sy^ , 



dt, 


respectively, and provide the results in Figure]^ 

The cumulative sensitivities remain consistent across the three simulation studies, which is to be expected 
given the consistency across the simulation studies in the optimal model parameters and initial conditions 
themselves. 


Interactions within the Intestinal Microbiota 


As with the simulation studies, we can calculate the cumulative local sensitivities with respect to the optimal 
model parameters and initial conditions. Note that because of how the parameter estimation problem is setup 
(see Appendix B.3|, the resulting model parameters for each of the three replicates are the same, but the 
optimal initial conditions differ. This means the local sensitivities and hence the cumulative sensitivities can 
vary by replicate, yet the results in Figure display consistency in the cumulative sensitivities across the 
replicates. 
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Figure 9. Cumulative local sensitivities of optimized model parameters and initial conditions. 

The images display the cumulative local sensitivities for every optimal model parameter and initial 
condition for all three replicate experiments. 


Biologically speaking, our method demonstrates the ability to closely model real biological data, but we 
would like to note that the resulting model is still unable to provide a full forward prediction of the intestinal 
microbiome’s state. The Lotka-Volterra system appears to be unstable leading to the uncontrolled growth, or 
the disappearance of certain taxa, phenomena not commonly observed in real data. In part this is due to the 
limitations of the Lotka-Volterra system and the simplifying assumptions made when choosing it as a model. 
Insufficient data both in terms of the relatively small number of samples and, more importantly, in terms 
of the sparse sampling rate also play a role. Unfortunately, these limitations are inherent as computational 
costs can require model simplifications, experimental costs can limit the number of feasibly obtained samples, 
and the specific microbiome being sampled can potentially limit the sampling frequency. 

Despite these limitations, modeling approaches such as the few described here are still extremely useful 
in understanding host-associated microbial communities. In particular, we showed that we could identify 
interactions and their direction between members of the gut microbiome that were later confirmed by in 
vivo experiments. The mathematical framework we described here correctly identified the Lachnospiraceae 
family as playing a stabilizing role in the gut microbiome, which contrasts with the suggestion that the 
Coprobacillus OTU plays this role as previously predicted, and while the full dynamics of the system cannot 
yet be predicted, knowledge of the interactions and their direction can sufficiently guide further biological 
experimentation. Therefore, we suggest that computational approaches such as ours effectively combined 
with experimental approaches can help elucidate the role of host-associated microbes in health and disease. 
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A Cubic Spline Construction 

Using gradient-based optimization methods for solving Q requires the derivatives of the spline s with respect 
to the time and the coefficients q. In particular we need to compute s, Sq, s', and Sq. Here, we provide the 
required derivations for the cubic spline function s and its derivatives. 


A.l Spline Definition 

Let data points a = to <■••< tn — b and any corresponding real numbers qj, j = 0,... ,n been given. 


Let s : [a, 5] —>■ K be a function with the interpolation properties 

for J=0, 

that is equal to a cubic polynomial with coefficients aj, bj, Cj, and dj on each interval i.e., 

~ ~ ~ — tj) + dj 

for tj <t < tj+i, j = 0,..., n — 1, and is twice differentiable, i.e., 

for j = 0,..., n — 2. We refer to s as a cubic spline m- 


(5) 

( 6 ) 

(7) 


Equations (H,® and Q provide 4n—2 conditions for the 4n coefficients Uj, bj, Cj, and dj, j = 0,..., n—1. 
A spline s is now uniquely determined by choosing appropriate boundary conditions. Here, we choose not- 
a-knot boundary conditions, i.e., So'(ti) = s'”{ti) and For efficient calculations of 

the coefficients aj, bj, Cj, and dj, we define moments 

= s''{tj) 


for j = 0,..., n. Note that the second derivative of s is linear in each interval \tj,tj+i], and for t £ [tj,tj+i], 
the moments rrij have the expression 


s'' (t) = TO, 


tj+i — t 

hj+i 


■ ruj+i 


hj+i 


with hj+i = tjj .1 — tj for j = 0,..., n — 1. Integrating and using equation we get 

{tj+i-tf {t-tjf , dj+i-Qj hj+i 


s'j{t) = -r 
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-i+i, 


(toj+i - TOj) (t - tj) + Qj - rtij - 
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hj+i 6 

With the condition s'(t^+i) = s'j^i{tjj.i) for j = 1,..., n — 1 we get the equation 

<lj+i-qj Qj-qj-i 

yTOj-i H --TOj -b —^TOj + i =- 


L+i 


which can be rewritten as 


(I - Xj)mj_i + 2mj + AjTOj+i = ^ 


6 f Qj+i - Qj Qj - Qj-i 


■j+i 
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Aj = for j = 1 ,... ,n - 1 . 

Now the moments m = [mg,..., mn]^ can be calculated by solving the linear system 

Am = f3. 

Here, /3 = [/3o,..., PnV with (3^ = 7 ^-+!^ for j = 1,..., n - 1 and 


A = 


Aq 

A 

A„ 


(n+l)x(n+l) 


with 


A = 


1 — Ai 2 Ai 


1 Ayi_l 2 An_i_ 

The entries Ag, A„, /3g, and /3n are specified by the boundary conditions. For the not-a-knot conditions, 


Ag = 


A77, — 


11 1 1 „ 

~T“:T— 

hi hi h 2 /12 




1 


1 1 


hn — 1 1 hji 


l3o= /3n= 0. 

With m given, coefficients Uj, bj, Cj, and dj are determined by 

dj = qj 

_ qj+i - qj {2mj + mj+i)hj+i 


^ h 

6 - — — 
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■j+i 


( 8 ) 


TOj+i — mj 


6h 


■j+i 


for j = 0 ,..., n — 1 . 


A.2 Derivatives of the Spline Function 

We are interested in calculating Sq, where q = [(?g,..., qnY■ With equation ®) we get 
®qW|te[q.q+i] = 

= ^ («j(i - ^3? + - tjf + Cj{t - tj) + dj) 

, NX da, , Nxdfci , . dc, dd,- 
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First, the coefficients aj, bj, Cj, and dj depend on the moments m. The derivative ^ ^/3) = 

A-iM = a-^B with 




0 0 

Ml ~Mi(l ~ ''^i) 

—M2A2 M2 ~M 2(1 — A2) 


0 


Mn —1 Mn—1(1 1) 

0 0 


for the not-a-knot boundary conditions and for fj,j = — ^ , j = — 1. We can calculate ^ by 

solving the linear systems AM = B where M = ^. 

Finally, the derivatives of the coefficients with respect to q, i.e., ^,and^witha= [oq, ..., b 

[60,..., bn-i]^, c = [cq, ..., Cn-i]^ and d = [do ,..., can be computed using Q and are given by 


dd 

dq 

dc 

dq 


= E 
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— H_i 0 (Eg — E„) 
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Hi©(2Eo + E„)M 


db 

dq 

da 

dq 


= ^E„M 

= ^H.i © (Eg - E„)M, 


where Eg = [0, I„], E„ = [I„,0], I„ is the n X n identity matrix, Hi = [Mi,..., M„]^ © [!,...,!], and 
H_i = [1/Mi,..., 1/hnY ® [Ij • • • I !]■ The symbols © and © denote the Kronecker product the Hadamard 
product, respectively. 

The further required derivatives s' and can now be easily computed by 


and 


<(i) 


[tj + 




dq 


+ 2(t tj) 


dq 
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B Computational Details 

To run continuous shooting with a gradient-based optimization method on a problem that uses a Lotka- 
Volterra model with assumptions on sparsity, we require specific derivatives of the model and a mathematical 
formulation of the sparsity assumption. Here, we include these details and also our step-by-step approaches 
to both the simulation studies and the intestinal microbiota example. 


B.l Numerics for Lotka-Volterra Models and Sparsity Constraints 

Using the Lotka-Volterra system Q as our model, we assume the parameters p = [b; vec(A)] are considered 
unknown. Let us also replace the model state variable y with its approximation s. The derivatives of f with 
respect to s and p are given by 


fs = diag(b) -I- diag(s) A -|- diag(As) 
fp = [diag(s), © diag(s)] 
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with 0 denoting the Kronecker product. 

To include a sparsity constraint on the interaction matrix A in Q, the constraint term al?(c(p, s(t; q)) 
becomes a ||vec(A)|| In this case T) is the one-norm, and c is the function that maps p to a vector of the 
parameters in A. Note that this constraint is not differentiable everywhere, but one way to overcome this is 
by approximating the 1-norm using a smooth function. The approximation we use is 

|!vec(A)||^ « ^ (9) 


where is the Huber function defined by 

|a;| > e 
|x| < £. 

The idea here is that the function ||vec(A)||j^ is approximated by a smooth quadratic curve near its corners 
with “near” being defined by the choice of £. Another important note regarding this modification to the 
objective function is that the effect on the numerics of the method. The data-fitting and model-fitting 
contributions to the function, gradient, and Hessian terms can be calculated as before. The contributions 
of the sparsity constraint term, however, require the first and second derivatives with respect to p and q of 
the approximation given in ([^. 
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B.2 Simulation Study Details 


Given the Lotka-Volterra system with parameters and initial conditions 
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we numerically solve the initial value problem and let ytrue denote this forward solution. We then collect the 
values of ytrue at the times given by the uniform discretization 0 = < ... < ^20 = 10 and perturb them 

to generate the data d = vec(D) = vec([di,..., d 2 o]). Here, dj = (1 -|- £j)ytrue,j for j = 1,..., 20 with 
representing a scaled vector of independent and identically Beta distributed noise, i.e., ~ 7 • (Beta(2, 2) — 

1/2). We conduct three studies, each with different noise level scales (Study 1: 7 = 0; Study 2: 7 = 0.1; 
Study 3: 7 = 0.25). 

In the objective function, the projection m is the identity projection for all three studies, but the weight 
matrix W is different for each study because it depends on the data and is taken to be W = I 20 ® 
diag([?ci,..., ^ 4 ]^) where I 20 is the 20 x 20 identity matrix and Wi = 10 /cr(Di), z = 1 ,..., 4, is the linearly 
scaled weighting of the inverse standard deviation of state i’s time-series data. Additionally, we use the 
standard constraint term in the objective function with a sparsity constraint on the interaction matrix 
A, which was defined earlier. For each study, we separately sample 1,000 (A, Q;)-pairs from the square 
[1,100] X [0.01,1] and choose a pair using leave-one-out cross-validation. The (A, a)-pairs are approximately 
(1.1416,0.01261), (5.5098,0.04584), (23.8228,0.87599) for studies 1, 2, and 3, respectively. 

For each study, we then separately sample the parameter space 10,000 times using a Latin hypercube 
sampling and perform local optimizations using the Gauss-Newton method with each sample serving as an 
initial parameter set. The global minimizer is the local minimizer that most minimizes the objective function. 


B.3 Intestinal Microbiota Details 

The data collected consists of the abundance levels for eleven operational taxonomic units (OTUs) on days 
1, 3, 7, 14, and 21 for each of the three mice. We eliminate any OTU that was not present in a measurable 
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amount at all time points for any of the mice, which reduces our data to seven OTUs labeled Blautia, Barne- 
siella, Unclassified Mollicutes, Undefined Lachnospiraceae, Unclassified Lachnospiraceae, Coprobacillus, and 
Other. Here, Other is the eleventh original OTU and is the collection of bacteria not assigned to any of the 
other ten original OTUs. 

For our method we use all 21 (seven OTUs for three mice) time-series as the data, but we model the 
seven OTU interactions using a single, seven state Lotka-Volterra model given by 

y' = f(y) = diag(y) (b + Ay), y(l) = yo- 

This then means that the model given to the objective function is actually 

y' = f(y) = diag(y) (b -1- Ay), y(l) = yg 


with 


y = 

to I—* 

_1 

, b = 

h' 

b 

, A = 

1 

< 

1 _ 

, yo = 

1- 

CMO 

1__ 


[yL 


_b 


A 


LygJ 


The superscripts indicate the data separately collected from each of the three mice, so our method approx¬ 
imates 21 different state variables, corresponding to the 21 total time-series, with spline functions. Those 
approximations, however, are all governed by a single seven-state Lotka-Volterra system defined by the 
intrinsic growth vector b and the interaction matrix A. 

The projection m in the objective function is the identity projection, and the weight matrix W is defined 
as W = l 5 ( 8 )diag([r(;i,..., m2i]^) where I 5 is the 5x5 identity matrix and Wi = l/(100x cr(Di)), i = 1,..., 21. 
We also replace the standard constraint term in the objective function with a sparsity constraint on the 
interaction matrix A, which was defined earlier. To find the regularization parameters A and a, we sample 
100 (A,a)-pairs from the square [1,100] x [10“®, 10“^] and choose a pair using 12-fold cross-validation. The 
(A,a)-pair is approximately (2.6727,5.7508 x 10“®). 

We then sample the parameter space 1,000 times using a Latin hypercube sampling and use these samples 
as initial parameter sets for local optimizations by the Gauss-Newton method. The global minimizer is the 
local minimizer that most minimizes the objective function. 


C Matlab Implementation 

C.l Continuous Shooting 

function [varargout] = continuousShooting(mFcn, nf, t, d, w, p, paramPE) 

7 . 

7, function [optResults] = continuousShooting (mFcn , nf , t, d, w, p, paramPE) 

% 

7« Author : 

% (c) Matthias Chung (mcchung@vt.edu) 

% Justin Krueger (kruegej2@vt.edu) 

7o 

% Date: June 2015 

7. 

7. MATLAB Version: 8.4.0.150421 (R2014b) 

7. 

% Description: 

7« This parameter estimation procedure minimizes 
7o 

% min_(p,q) f(p,q) = a*||w(prjFcn(s(tau,q,t))-d)||''2 + b* lambda * ||s’(tau,q,T)- mFcn(T,s 

(tau,q,T),p)I|‘2, 

7o 

7« for a given ODE model function mFcn, parameter set p, a data set (t,d) and 

7« a weight matrix w. lambda is a regularization on the accuracy of the 
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7, 

% 

7o 

7. 

7. 

7o 

7o 

7. 

7o 

7o 

% 

% 

% 

7o 

7. 

% 

% 

7. 

7. 

7o 

7o 

7. 

7. 

7o 

7, 


7o 

7o 


7, 


% 

% 


% 


7, 


7. 


% 

7o 

7. 

7. 


7. 

7. 

7o 

7o 

7. 

7. 

7o 

7o 

7. 

7. 

7o 

7. 

7. 


model. s is a cubic spline function uniquely determined by the 
parameters q and knots tau. s(tau,q,t) means spline with parameters 
q and knots tau evaluated at times t. s^ is the time derivative 
of spline s . 

By default the method uses a Gauss-Newton optimization method and 
requires the derivatives of f with respect to y and p. 

Inital guess for y is d. Parameter a is set to 1/nd where nd is the 
total number of data points over all states. The parameter b is set to 
l/(nT*nf) where nT is the number of sample times T and nf is the 
dimension of the model. 


This algorithm requires the function cubicSpline.m and possibly 
others depending on the outputs requested. 


Input arguments: 
mF cn 
nf 
t 
d 
w 
P 


model function of ODE y’ = mFcn(t,y,p) of dimension nf 
dimension of the model function 

time points of dimension 1 x n+1 where measurements are taken (or cell) 
data values at times t with dimension m x n+1 (or cell) 
weighting matrix for data values with dimension m x n+1 (or cell) 
inital guess for parameter values 


#paramPE 

optFcn - optimization function to be used [default OgaussNewtonJacobian] 

prjFcn - projection function of model onto observation (data requires 

derivatives of prjFcn) [default @1inearProjection] 

lambda - regularization parameter (accuracy of model equations) [default 1] 

ntau - number of time points used to create initial guess for spline 

parameters q [default 50] 

nT - number of time points used to discretize model misfit norm [default 

100 ] 


regFcn - additional regularization terms [default OregOrganizer] 

alpha - regularization parameters for model parameter regularization (row 

vector) [default 0] 

beta - regularization parameters for spline parameterregularization (row 

vector) [default 0] 


q - initial guess of model solution at time points tau [default does not 

exist] 

paramOPT - all typical parameter to be set for the TIA optimization toolbox [ 
default {}] 


Output arguments: 
varargout 

{1} - structure containing basic information such as minimizers, minimum objective 
function , etc. 


Example: 

modelFcn = OlotkaVolterra; 
nf = 4; 

t = linspace(0, 2 * pi, 20); 
d = [cos(t); sin(t)]; 

w = 10*bsxfun(Qrdivide, ones(size(times)) , std(data,0,2)); 
p = [2 2 3 1 -1 2] ^ ; 
paramPE = {^lambda’, le-1}; 

[optResults] = continuousShooting(modelFcn, nf, t, d, w, p, paramPE); 
References: 


% convert data to appropriate format 
if iscell (t) 

[t, d, w] = rearrangeData(t, d, w); % reshape if non-consistant measuring time points 

end 


7« dimensions of parameter p 
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np = size (p , 1) ; 

’/« set default parameters 
optFcn = OgaussNewtonJacobian; 
prjFcn = OlinearProjection; 
lambda = 1; 

of model equations 
ntau = 50; 
nT = 100; 

misfit 


% optimization algorithm 
‘/o projection onto data 

% regularization parameter for accuracy 

*/o number of knots for spline 
% number of evalauation points for model 


7, set default regularization for model and spline parameters (none) 
regFcn = OregOrganizer; % regFcn will return O’s when used 

alpha = 0; % regularization parameter(s) for model regularization 

beta = 0; 7« regularization parameter(s) for spline regularization 

7« default options for optimization toolbox 
paramOPT = {}; 

% rewrite default options if needed 
if nargin == nargin (mfilename) 
for j = 1: size (paramPE,1) 

eval([paramPE{j ,1}, ’= paramPE{j ,2}; ’]) ; 

end 

end 


y« set outputs of misfit function to be either residual or fuction value 
res = 1; 

if " Strcmp (func2str(optFcn), ’gaussNewtonJacobian’ ) 

res = 0; 

end 


7« time discretizations 

tau = linspace (t(1), t(end), ntau); % set default knots of the spline 

T = linspace (t (1) , t(end), nT) ; % choose default evaluation points of T for model 

misfit s’(tau,q,T) - f(T,s(tau,q,T),p) 

% inital guess for spline parameters if none is given 
if "existC^q’j’var’) 

q = zeros (nf, ntau); 

7» construct q for one state variable at a time 
for j = 1 : nf 

idxData = find(d(j,:) ~= 0); 

7o ensure at least two data points for spline 
if length (idxData) > 2 

q(j,:) = cubicSpline (t ( idxData) , d (j , idxData) , tau); 7, interpolate data for q 
values 

end 

end 

end 


% reshape w and d 
w = w (:) ; 
d = d(:); 

% remove all indices where no data is available 
idxFull = find(w ~= -1); 
nidxFull = length (idxFull); 

% remove all indices where no data is available and all indices the user wishes to ignore 
idx = find(w ~= -1 & w ~= 0); 
nidx = length(idx); 

w = w(idx)/ sqrt (nidxFull); 7» normalization of residual weights w in s(tau,q,t)-d by sqrt(nd) 
d = d(idx); 
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7, pre-compute dsdqD , dsdqM , and dsdqdtM 
[~, dsdqD] = cubicSpline(tau, q, t); 

[~, dsdqM, dsdqdtM] = cubicSpline(tau, q, T); 

misfit 

e = speye (nf , nf) ; 
dSdQD = kron(dsdqD, e); 
dimension 

dSdQM = kron(dsdqM, e); 
dimension 

dSdQdtM = kron (dsdqdtM , e); 
dimension 


% get spline derivatives for data misfit 
% get spline derivatives for model 


% replicate 
% replicate 
% replicate 


dsdqD for correct model 
dsdqM for correct model 
dsdqdtM for correct model 


y« objective function 

oFcn = @(pq) misfitFen(pq, mFcn, nf, d, w, np, prjFcn, idx, nidx, lambda, tau, ntau, T, nT, 
dsdqD, dsdqM, dsdqdtM, dSdQD, dSdQM, dSdQdtM, res, regFcn, alpha, beta); 


7« initial search parameters pq = (p (unknown),q) 
pq = [p; q(:)] ; 


% solve parameter estimation scheme 
[pq, f] = optFcn(oFcn, pq, paramOPT); 


7« extract optimal parameters p and q, 
p = pq(l:np); 

q = reshape (pq(np+1: end) , nf, ntau); 

set has the proper dimensions 
y = cubicSpline(tau, q, t); 

states y using a cubic spline and 
yO = y(: , 1) ; 

conditions for the states y 


approximated state y, and initial condition yO 
y» update optimized parameters 

y, reshape q so the spline parameter 


% construct an approximation of 
optimized spline parameters 

y» provide the optimized initial 


y« store optimization and decomposition results 

varargout{l} = cell2struct ({p, q, yO, y, f}, {^pMin^, ’qMin^, ’yOMin’, ^yMin’, ’fMin’}, 2); 


end 


% - 

% arrange times, data, and weights in a usuable form 
function [t, d, w] = rearrangeData(T, D, W) 

y« create sorted vector of measuring times and remove duplicate entries 
t = unique( sort (cell2mat(T’))) ; 

y, generate -1 at non measured points in d and w 
w = -1* ones( length (T) , length(t)); 
d = -1* ones( length (T) , length(t)); 
for j = l:length(T) 

for i = 1: length (T{j}) 

w(j,T{j}(i) == t) = W-CjJd); y replace -1 any place a weight is given 
d(j,T-Cj}(i) == t) = D{j)-(i); 7 replace -1 any place a data point is given 

end 

end 

end 

% - 

y, calculate the objective function value and/or other needed values 

function [varargout] = misfitFcn(pq, mFcn, nf, d, w, np, prjFcn, idx, nidx, lambda, tau, 
ntau, T, nT, dsdqD, dsdqM, dsdqdtM, dSdQD, dSdQM, dSdQdtM, res, regFcn, alpha, beta) 

y, split search parameter pq into parameters p and q 

p = pq(l:np); % update optimized parameters 

q = reshape (pq(np+1: end) , nf, ntau); 7, reshape q so the spline parameter set has the 

proper dimensions 
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7, normalization of residuals s^(T)- f (T , s (t au , q , T) , p) 
lambda = sqrt (lambda/(nf*nT)); 


7« calculate spline interpolations and derivative of splines 
sD = q*dsdqD’; 
sM = q*dsdqM ^; 
dsdtM = q*dsdqdtM'; 


% [sD] 

7o [sM , dsdtM] 


cubicSpline(tau, q, 
cubicSpline(tau, q, 


t); % evaluate s(tau,q,t) for data misfit 
T); % evaluate s(tau,q,T) for model misfit 


% only r or f is required 
if nargout == 1 

7, calculate model function f 
F = mFcn(T, sM, p); 

7» calculate projection 

prjsD = prjFcn(sD); % calculate projection 
prjsD = prjsD(:); 7« vectorize projection 

prjsD = prjsD(idx); 7« remove indices of unused values from the projection 
7» residual 

r = [w.*(prjsD-d); lambda*(dsdtM(:)-F(:))]; % combined residual of data and model misfit 

7» r is required 
if res 

rR = regFcnCp, tau, q, res, alpha, beta); % calculate residual for regularization 
term (s) 

varargout{l} = [r; rR] ; 7o add regularization residuals 

% f is required 
else 

f = regFcn(p, tau, q, res, alpha, beta); 7o calculate function value for 
regularization term (s) 

varargout{l} = 0.5*(r^*r) + f; 7» calculate function 

end 


7« more than r or f is required 
else 


7» calculate model function f and derivatives fs and fp 
[F, Fs, Fp] = mFcn(T, sM, p); 


7» calculate projection 
[prjsD, dprjsD] = prjFcn(sD); 
prj sD = prj sD ( : ) ; 
prjsD = prjsD(idx); 
dprjsD = dprjsD(idx , :) ; 


% calculate projection and its derivative 
7« vectorize projection 

% remove indices of unused values from the projection 
7« remove rows of derivatives due to unused values 


7» residual and Jacobian 

r = [w.*(prjsD-d) ; lambda*(dsdtM(:)-F (:))] ; 

7o combined residual of 

data and model misfit 

J = [ sparse (nidx,np), spdiags (w,0,nidx,nidx)*dprjsD*dSdQD; -lambda*Fp, lambda*(dSdQdtM - 
Fs*dSdQM)]; 7o Jacobian of data and model misfit 


7» r and J are required 
if res 

[rR, JR] = regFcnCp, tau, q, res, alpha, beta); 7» calculate residual and Jacobian 
for regularization term(s) 

varargout{I} = [r; rR]; % add regularization 

residuals 

varargout{2} = [J; JR]; % add regularization 

Jacobian 

7» more than f is required 
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else 


y» f and g are required 
if nargout == 2 

[f, g] = regFcnCp, tan, q, res, alpha, beta); % 
gradient values for regularization term(s) 
varargout{l} = 0.5*(r’*r) + f; 
varargout{2} = J’*r + g; 
y f, g, and H are required 
else 

[f, g, H] = regFcn(p, tau, q, res, alpha, beta); y 
and Hessian values for regularization terin(s) 
varargout{l} = 0.5*(r’*r) + f; 
varargout{2} = J’*r + g; 
varargout{3} = J^*J + H; 

end 


calculate function and 

y calculate function 
y calculate gradient 


calculate function, gradient. 


y calculate 
y calculate 
y Calculate 


function 

gradient 

Hessian 


end 


end 

end 

C.2 Cubic Splines 

function [s, dsdt, dsdy, dsdydt] = cubicSpline(t, y, tt) 

y 

y function [s, dsdt, dsdy, dsdydt] = cubicSpline(t, y, tt) 

y 

y Author: 

y (c) Matthias Chung (mcchung@vt.edu) 

y Justin Krueger (kruegej2@vt.edu) 

y 

y Date: February 2014 

y 

y MATLAB Version: 8.1.0.604 (R2013a) 

y 

y Description: 

y This function calculates the piecewise polynomials (cubic spline) or the values of 
y a cubic spline s with given knots t and corresponding data y(t). If the 

y third argument tt is given this function evaluates the piecewise polynomials 

y at timepoints tt. Not-a-knot boundary condition are used. 

y Additionally this function calculates derivatives ds/dt, ds/dy and d/dy(ds/dt). 
y Note, the derivatives ds/dy and d/dy(ds/dt) are independent of y and for 
y multiple dimensions of y, dsdy and d/dy(ds/dt) are of the form 
y dsdy = [B 0 ... 0 0] 

y [: : : :] 

y [0 0 ... 0 B] 

y This function just returns just the diagonal block B for efficiency. 

y 

y Input arguments : 

y t - knots of spline (row vector 1 x n+1) 

y y - interpolation values of spline (matrix m x n+1) 

y tt - evaluation points of spline (row vector 1 x k) 

y 

y Output arguments: 

y s - spline s (piecewise polynomial, evaluated at tt if tt not empty) 

y dsdt - derivative dsdt of s with respect to t (piecewise polynomial, evaluated at tt 

if tt not empty) 

y dsdy - derivative dsdy of s with respect to y (piecewise polynomial, evaluated at tt 

if tt not empty) 

y dsdydt - derivative dsdydt of s with respect to y and t (piecewise polynomial, 

evaluated at tt if tt not empty) 

y 

y Example: 

y t = linspace(0, 2 * pi, 20); 
y y = [cos(t); sin(t)]; 
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7, tt = linspace(0, 2 * pi , 100); 

7« [s, dsdt, dsdy, dsdydt] = cubicSpline(t, y); (returns piecewise polynomials) 

7« [s , dsdt, dsdy, dsdydt] = cubicSpline (t , y, tt); (returns spline values) 

7. 

7. References : 

% [1] Carl DeBoor, A Practical Guide to Splines, Reprint edition. Springer, 1994. 

7o 

7« initialize fixed options of algorithm 
n = length (t)-1; 

% check input data 
if n < 1 

error(^There must be at least two data points.’); 

end 

% initialize constants 

h = diff(t); 7» length of interval parts h 

m = size(y, 1); 7o dimension of y 

diffy = diff(y, 1, 2); % difference of interpolation points 

if n == 1 7» interpolant is a straight line 

s = mkpp(t, [diffy/h, y(:,l)], m); 
if nargout > 1 

dsdt = mkpp(t, diffy/h, m); 
if nargout > 2 

dsdy = mkpp(t, [-1, h; 1, 0 ]/h, 2)’; 
if nargout == 4 

dsdydt = mkpp(t, [-1, l]/h, 2)’; 

end 

end 

end 

elseif n == 2 7» interpolant is a parabola 

a = (diffy(:,2)/h(2) - diffy(:,l)/h(l))/(h(l)+h(2)); % quadratic coefficient 
s = mkpp(t([l,3]), [a, diffy(:,l)/h(l) - a*h(l), y(:,l)], m); 7 construct spline s 

if nargout > 1 7a construct dsdt 

dsdt = mkpp(t([1,3]) , [2*a, diffy ( : ,1)/h ( 1) - a*h(l)], m) ; 

if nargout > 2 % construct dsdy 

a = 1/(h ( 1)+h(2)) * [l/h(l); -l/h(2)-l/h(1) ; l/h(2)]; % quadradic coefficient 

dsdy = mkpp(t([l,3]),[a, l/h(l)*[-l; 1; 0] - h(l)*a, [1; 0; 0] ], 3)’; 

if nargout == 4 % construct dsdydt 

dsdydt = mkpp(t ( [1 ,3]) , [2*a, l/h(l)*[“l; 1; 0] - h(l)*a], 3)’; 

end 

end 

end 


else 7o interpolant is a regular spline 

% upper diagonal of matrix 

lambda = (h(2:n) ./(h(1:n-1)+h(2:n))) ’ ; 

7» initialize tri-diagonal matrix 

A = spdiags([[l -lambda; 0; 0], 2*ones(n+l,l), [0; 0; lambda]], -1:1, n+1, n+1); 

7» initialize right hand side of linear system 

B = [zeros(m,l), 6*(bsxfun(@rdivide,diffy(:,2:n),(h(l:n-l)+h(2:n)).*h(2:n)) - bsxfun( 
Qrdivide,diffy(:,l:n-l),(h(l:n-l)+h(2:n)).*h(l:n-l))), zeros(m,l)] ’; 

7» set not-a-knot boundary conditions 

A(l, 1:3) = [ -l/h(l), l/h(l)+l/h(2), -l/h(2)]; 

A(n+1, n-l:n+l) = [-l/h(n-l), l/h(n-1)+l/h(n), -l/h(n)]; 
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M = (A\B) ’ ; “/o solve for moments M 

H = bsxfun (©times , ones(m,l), h) ; */» distances h for each dimension of y 
y, generate piecewise polynomial 

s = mkpp(t, [dif f (M,1,2)./(6+H), M(:,l:n)/2, diffy./H - (2*M(:,1:n)+M(:,2:n+1)).*H/6, y 
(: ,l:n)] , m) ; 

if nargout > 1 7, calculate derivative dsdt 

dsdt = mkppCt, [dif f (M,1,2) . / (2*H) , M(:,l:n), diffy./H - (2*M ( : , 1:n)+M(: , 2:n + 1)) . 

/6] , m) ; y# calculate dsdt 

if nargout > 2 % calculate derivatives dsdy and dsdydt 

y, initialize Jacobian of right hand side 
mu = -6./(h(l:n-l).*h(2:n ))^; 

JB = spdiags([[-mu.* lambda; 0; 0], [0; mu; 0], [0; 0; -mu.*(l-lambda)]], -1:1, n 

+1, n+1); 

y« calculate Jacobian of the moments 
JM = full (A\JB) ; 

y, initialize constants 
H = bsxfun(©times, h’, ones(1,n+1)); 

10 = [sparse (n , 1) , speye(n)]; 

IN = speye(n, n+1); 

y« calculate coefficients 

Jc = l./H.*(I0-IN) - H/6.*(2*IN+I0)*JM; 

Jb = 1/2+IN+JM; 

Ja = l./(6*H) .*(I0-IN)*JM; 

dsdy = mkpp(t, [Ja; Jb ; Jc; full(IN)]', n + 1); "/<> calculate dsdy 

if nargout == 4 y calculate dsdydt 

dsdydt = mkpp(t, [3*Ja; 2*Jb; Jc]', n+1); 

end 

end 

end 


end 

y, evaluate piecewise polynomial at time points tt otherwise return piecwise polynomials 
if nargin > 2 7 return spline values 
s = ppval (s, tt) ; 
if nargout > 1 

dsdt = ppval(dsdt, tt); 
if nargout > 2 

dsdy = ppval(dsdy, tt)’; 
if nargout == 4 

dsdydt = ppval(dsdydt , tt)’; 

end 

end 

end 

end 

end 

C.3 Lotka-Volterra Model 

function [f , dfdy , dfdp] = 1 otkaVolterra (“■ , y, p) 

7 

7 function [f, dfdy, dfdp] = lotkaVolterra(", y, p) 

7 

7 Author: 
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(c) Matthias Chung (mcchung@vt.edu) 
Justin Krueger (kruegej2@vt.edu) 

Date: July 2014 

MATLAB Version: 8.1.0.604 (R2013a) 


Description : 

The Lotka-Volterra differential equation 
y’ = diag(y)(r+Ay) 

with nf species (length of r is nf and dimension of A is nf x nf). 
A and r are collected in the parameter p = [r; vec(A)]. 

Input arguments: 

- time (no time dependence, autonomous system) 
y - value at time(s) t 

p - parameters (length nf'‘2 + nf) 

Output arguments: 

f - first output argument is the model function 

f = [y_l ’;■■■; y_nf ’] 

dfdy - second output argument is df/dy 

dfdp - third output argument is df/dp 


f = [f_l(t_l) 

[ ... 

[f _nf(t_ 1) 


f_1(t_nt) ] 

] 

f_nf(t_nt)] 


Dimension: (nf*nt) x (nf*nt) 

dfdy = [df/dy(t_l) 0 ... 0] 

[0 ... ... ... 0 ] 

[0 ... 0 df/dy(t_nt)] 


where df/dy(t) = [df_l/dy_l(t) df_l/dy_2(t) 

[ . . . 

[df_nf/dy_1 (t) df_nf/dy_2(t ) 


df_1/dy_nf (t) ] 

] 

df_nf/dy_nf (t)] 


Dimension: (nf*nt) x (np) 
dfdp = [df/dp(t_l) ] 

[ ... ] 

[df/dp(t_nt)] 


where df/dp(t) 


[df_1/dp_1(t) df_l/dp_2(t) 

[ 

[df_nf/dp_1(t) df_nf/dp_2(t) 


df_1/dp_np(t) ] 

] 

df _nf/dp_np(t)] 


and p = [r; vec(A)] 

To expand: use objects to calculate dfdp and dfdy. 


Example: 

[f, dfdy, dfdp] = lotkaVolterra(" 


[4 6] 


[ 2231 - 12 ]’) 


References : 


7« record dimensions of y 
[nf , nt] = size (y) ; 


% decompose parameters 

r = p(l:nf); 7« intrinsic growth vector 

A = reshape (p (nf+1: end) , nf , nf ) ; 7« interaction matrix 

7« model function 
f = DYDT(y, r, A, nf, nt); 
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’/« df/dy required 
if nargout > 1 

dfdy = DFDY(y, r. A, nf , nt) ; “Z# derivative with respect to the state 
y, df/dp required 
if nargout > 2 

dfdp = DFDP(y, nf, nt); % derivative with respect to the parameters 

end 

end 

end 

% - 

% Lokta-Volterra model f 

function f = DYDT(y, r, A, nf, nt) 

f = y.*(bsxfun(@times ,r,ones(nf , nt ) ) + A*y) ; 

end 


% - 

y« derivative of f respect to y 
function dfdy = DFDY(y, r. A, nf, nt) 
m = nf *nt; 

dfdy = spdiags (repmat(r,nt,l),0,m,m) + spdiags(y(:),0,m,m)*kron(speye(nt,nt),A) + spdiags( 
reshape(A*y,m,l) ,0,m,m); 

end 


% - 

y« derivative of f respect to p 
function dfdp = DFDP(y, nf, nt) 
m = nf *nt; 

dfdr = spdiags(y(:),0,m,m) * repmat (speye(nf ,nf) ,nt ,1) ; 
dfdA = spdiags(y(:),0,m,m)*kron(y’,speye(nf,nf)); 
dfdp = [dfdr, dfdA]; 
end 
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